The Bit Shift Paradox: How "Optimizing" Can Make Code 6× Slower
hackernoon.com·2d
SIMD Optimization
Multi-Core By Default
rfleury.com·17h
🔩Systems Programming
An enough week
blog.mitrichev.ch·23h
🧮Z3 Solver
Real-Time Adaptive Sparsity Optimization for Edge-Deployed AI Inference Accelerators
dev.to·9h
🌊Streaming Compression
Z8 G4 - 768gb RAM - CPU inference?
reddit.com·21h
🖥️Modern CPU
TRIM: Token-wise Attention-Derived Saliency for Data-Efficient Instruction Tuning
arxiv.org·1d
🔨Compilers
Fast Matrix Multiply on an Apple GPU
percisely.xyz·2d
SIMD Vectorization
LLM Optimization Notes: Memory, Compute and Inference Techniques
gaurigupta19.github.io·4d
💻Local LLMs
GCC Patches Posted For C++26 SIMD Support
phoronix.com·8h
🔩Systems Programming
Profiling Your Code: 5 Tips to Significantly Boost Performance
usenix.org·17h
📊Performance Profiling
Server CPU: Clearwater Forest comes as Xeon 6+ with up to 288 cores
heise.de·23h
Nordic Processors
GoMem is a high-performance memory allocator library for Go
github.com·16h
🧠Memory Allocators
Intel reveals XeSS 3 with Multi-Frame Generation - and unlike Nvidia's MFG, it works on older GPUs
techradar.com·21h
🖥️Terminal Renaissance
Let's Write a Macro in Rust
hackeryarn.com·3h
🦀Rust Macros
AI and Deep Learning Accelerators Beyond GPUs in 2025
bestgpusforai.com·1d
Homebrew CPUs
Enhanced SoC Design via Adaptive Topology Optimization with Reinforcement Learning
dev.to·16h
🧩RISC-V
The Library Method: Understanding @cache
dev.to·17h
Cache Theory
Beating the L1 cache with value speculation (2021)
mazzo.li·4d
CPU Microarchitecture
Just shipped Shimmy v1.7.0: Run 42B models on your gaming GPU!
reddit.com·1d
🖥️Terminal Renaissance